The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about common practice or about the bottlenecks the community faces in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical image analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, and algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the prospect of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants (32%) stated that they did not have enough time for it, and 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% performed ensembling, based either on multiple identical models (61%) or on heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Translated by Google Translate
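The patch-based strategy that the survey reports for oversized samples can be illustrated with a small sketch. It only computes where patches would be placed so that a sample of arbitrary dimensionality is fully covered; the function names `patch_starts` and `patch_grid`, and the choice to add one extra patch flush with the far edge, are assumptions made for illustration and are not taken from the survey.

```python
from itertools import product


def patch_starts(length, patch, stride):
    """Start offsets along one axis so that patches cover [0, length)."""
    if patch >= length:
        return [0]  # a single patch already spans the whole axis
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] != length - patch:
        # Add one patch flush with the far edge so no voxels are left out.
        starts.append(length - patch)
    return starts


def patch_grid(shape, patch_shape, stride_shape):
    """Corner coordinates of all patches for an N-dimensional sample."""
    axes = [
        patch_starts(n, p, s)
        for n, p, s in zip(shape, patch_shape, stride_shape)
    ]
    return list(product(*axes))


if __name__ == "__main__":
    # A hypothetical 3D volume of 10 x 8 x 8 voxels, cut into 4 x 4 x 4 patches.
    corners = patch_grid((10, 8, 8), (4, 4, 4), (4, 4, 4))
    print(len(corners), corners[:3])
```

A stride smaller than the patch size yields overlapping patches, which is one common way to smooth predictions at patch borders.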
The capture and animation of human hair are two of the major challenges in the creation of realistic avatars for virtual reality. Both problems are difficult because hair has complex geometry and appearance and exhibits complex motion. In this paper, we present a two-stage approach that models hair independently from the head to address these challenges in a data-driven manner. The first stage, state compression, learns a low-dimensional latent space of 3D hair states containing motion and appearance, via a novel autoencoder-as-a-tracker strategy. To better disentangle the hair and head in appearance learning, we employ multi-view hair segmentation masks in combination with a differentiable volumetric renderer. The second stage learns a novel hair dynamics model that performs temporal hair transfer based on the discovered latent codes. To enforce higher stability while driving our dynamics model, we employ the 3D point-cloud autoencoder from the compression stage for de-noising of the hair state. Our model outperforms the state of the art in novel view synthesis and is capable of creating novel hair animations without having to rely on hair observations as a driving signal.
The explosive growth of dynamic and heterogeneous data traffic poses great challenges for 5G and beyond mobile networks. To enhance network capacity and reliability, we propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme that adaptively allocates the uplink and downlink time-frequency resources of base stations (BSs) to meet asymmetric and heterogeneous traffic demands while alleviating inter-cell interference. We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP) that maximizes the long-term expected sum rate under the users' packet dropping ratio constraints. To jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm named federated Wolpertinger deep deterministic policy gradient (FWDDPG). The BSs decide their local time-frequency configurations through RL algorithms and achieve global training by exchanging local RL models with their neighbors under a decentralized federated learning framework. Specifically, to deal with the large-scale discrete action space of each BS, we adopt a DDPG-based algorithm to generate actions in a continuous space, and then use the Wolpertinger policy to reduce the mapping errors from the continuous action space back to the discrete action space. Simulation results demonstrate the superiority of our proposed algorithm over benchmark algorithms with respect to system sum rate.
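The mapping from a continuous action back to a discrete one can be sketched as follows. This is a minimal illustration of the general Wolpertinger idea (match a continuous proto-action to its k nearest valid discrete actions, then let a critic pick among them), not the authors' FWDDPG implementation. The name `wolpertinger_select`, the tuple encoding of actions, and the `q_value` callable are hypothetical; the critic's dependence on the current state is assumed to be captured inside `q_value`.

```python
import math


def wolpertinger_select(proto_action, discrete_actions, q_value, k=3):
    """Pick a valid discrete action close to a continuous proto-action.

    proto_action: continuous vector produced by the actor (a tuple here).
    discrete_actions: valid discrete actions embedded in the same space.
    q_value: callable that scores an action; stands in for the critic.
    k: number of nearest candidates handed to the critic.
    """
    # Step 1: the k valid actions nearest to the proto-action (Euclidean).
    nearest = sorted(
        discrete_actions, key=lambda a: math.dist(a, proto_action)
    )[:k]
    # Step 2: the critic chooses the best candidate, which limits the error
    # that plain nearest-neighbour rounding could introduce.
    return max(nearest, key=q_value)


if __name__ == "__main__":
    actions = [(0, 0), (0, 1), (1, 0), (1, 1)]  # toy resource configurations
    toy_critic = lambda a: 2 * a[0] + a[1]      # assumed scoring function
    print(wolpertinger_select((0.1, 0.2), actions, toy_critic, k=3))
```

With `k=1` the function reduces to nearest-neighbour rounding; larger `k` gives the critic more room to correct the actor at the cost of more critic evaluations.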
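Segmentation of kidney structures from computed tomography angiography (CTA) is essential for many computer-aided kidney cancer treatment applications. The Kidney Parsing (KiPA 2022) Challenge aims to build a fine-grained multi-structure dataset and to improve the segmentation of multiple renal structures. Recently, U-Net has dominated medical image segmentation. In the KiPA challenge, we evaluated several U-Net variants and selected the best models for the final submission.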
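We present Neural Strands, a novel learning framework for modeling accurate hair geometry and appearance from multi-view image inputs. The learned hair model can be rendered in real time from any viewpoint with high-fidelity view-dependent effects. Unlike volumetric counterparts, our model enables intuitive shape and style control. To enable these properties, we propose a novel hair representation based on a neural scalp texture that encodes the geometry and appearance of individual strands at each texel location. Furthermore, we introduce a novel neural rendering framework based on rasterization of the learned hair strands. Our neural rendering is strand-accurate and anti-aliased, making the rendering view-consistent and photorealistic. Combining appearance with a multi-view geometric prior, we enable, for the first time, the joint learning of appearance and explicit hair geometry from a multi-view setup. We demonstrate the efficacy of our approach in terms of fidelity and efficiency for various hairstyles.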
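Vehicle-to-everything (V2X) communication techniques enable collaboration between vehicles and many other entities in the neighboring environment, which could fundamentally improve the perception system for autonomous driving. However, the lack of a public dataset significantly restricts research progress on collaborative perception. To fill this gap, we present V2X-Sim, a comprehensive simulated multi-agent perception dataset for V2X-aided autonomous driving. V2X-Sim provides: (1) multi-agent sensor recordings from the road-side unit (RSU) and multiple vehicles that enable collaborative perception, (2) multi-modality sensor streams that facilitate multi-modality perception, and (3) diverse ground truths that support various perception tasks. In addition, we build an open-source testbed and provide a benchmark for state-of-the-art collaborative perception algorithms on three tasks: detection, tracking, and segmentation. V2X-Sim seeks to stimulate collaborative perception research for autonomous driving before realistic datasets become widely available. Our dataset and code are available at https://ai4ce.github.io/v2x-sim/.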
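Satisfying safety constraints almost surely (or with probability one) can be critical for the deployment of reinforcement learning (RL) in real-life applications. For example, plane landing and take-off should ideally occur with probability one. We address the problem by introducing Safety Augmented (Saute) Markov Decision Processes (MDPs), in which the safety constraints are eliminated by augmenting them into the state space and reshaping the objective. We show that the Saute MDP satisfies the Bellman equation and moves us closer to solving safe RL with constraints satisfied almost surely. We argue that the Saute MDP allows the safe RL problem to be viewed from a different perspective, enabling new features. For instance, our approach has a plug-and-play nature, i.e., any RL algorithm can be "sauteed". Additionally, state augmentation allows for policy generalization across safety constraints. Finally, we show that Saute RL algorithms can outperform their state-of-the-art counterparts when constraint satisfaction is of high importance.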
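The idea of moving a safety constraint into the state and reshaping the objective can be sketched as an environment wrapper. This is a simplified illustration under stated assumptions, not the authors' exact formulation: the class names, the environment interface (`reset()` returning a tuple, `step()` returning observation, reward, cost, done), the normalization of the remaining budget, and the penalty value are all hypothetical.

```python
class ToyEnv:
    """Hypothetical environment: reward 1 per step, cost equal to the action."""

    def reset(self):
        self.t = 0
        return (0.0,)

    def step(self, action):
        self.t += 1
        return (float(self.t),), 1.0, float(action), self.t >= 3


class BudgetAugmentedEnv:
    """Expose the remaining safety budget as part of the state."""

    def __init__(self, env, budget, penalty=-100.0):
        self.env = env
        self.budget = budget      # total safety cost allowed per episode
        self.penalty = penalty    # reward used once the budget is exhausted
        self.remaining = budget

    def reset(self):
        self.remaining = self.budget
        obs = self.env.reset()
        # Augmented state: original observation plus normalized budget.
        return (*obs, self.remaining / self.budget)

    def step(self, action):
        obs, reward, cost, done = self.env.step(action)
        self.remaining -= cost
        if self.remaining < 0:
            reward = self.penalty  # reshaped objective: violations dominate
        return (*obs, self.remaining / self.budget), reward, done


if __name__ == "__main__":
    env = BudgetAugmentedEnv(ToyEnv(), budget=2.0)
    print(env.reset())
    for _ in range(3):
        print(env.step(1))
```

Because the wrapper only changes observations and rewards, any standard RL algorithm can be trained on it unchanged, which is the plug-and-play property the abstract describes.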
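We consider the problem of abnormality localization for clinical applications. While deep learning has driven much recent progress in medical imaging, many clinical challenges are not fully addressed, limiting its broader usage. Although recent methods report high diagnostic accuracy, physicians are reluctant to trust these algorithms for diagnostic decision-making because of a general lack of algorithm decision reasoning and interpretability. One potential way to address this problem is to further train these models to localize abnormalities in addition to classifying them. However, doing this accurately requires a large amount of disease localization annotations by clinical experts, a task that is prohibitively expensive for most applications. In this work, we address these issues with a new attention-driven weakly supervised algorithm comprising a hierarchical attention mining framework that unifies activation-based and gradient-based visual attention in a holistic manner. Our key algorithmic innovations include the design of explicit ordinal attention constraints, which enable principled model training in a weakly supervised fashion while also facilitating the generation of visual-attention-driven model explanations by means of localization cues. On two large-scale chest X-ray datasets (NIH ChestX-ray14 and CheXpert), we demonstrate significant localization performance improvements over the current state of the art while also achieving competitive classification performance. Our code is available at https://github.com/oyxhust/ham.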
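Capturing and rendering life-like hair is particularly challenging due to its fine geometric structure, complex physical interactions, and non-trivial visual appearance. Yet hair is a critical component of believable avatars. In this paper, we address the aforementioned problems: 1) We use a novel volumetric hair representation composed of thousands of primitives. Each primitive can be rendered efficiently by building on the latest advances in neural rendering. 2) To obtain a reliable control signal, we present a novel way of tracking hair at the strand level. To keep the computational effort manageable, we use guide hairs and classic techniques to expand them into a dense hood of hair. 3) To better enforce the temporal consistency and generalization ability of our model, we further optimize the 3D scene flow of our representation with multi-view optical flow, using volumetric ray marching. Our method can not only create realistic renderings of recorded multi-view sequences, but also create renderings of new hair configurations by providing new control signals. We compare our method with existing approaches on viewpoint synthesis and drivable animation and achieve state-of-the-art results.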
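Reinforcement learning (RL) involves performing exploratory actions in an unknown system. This can place a learning agent in dangerous and potentially catastrophic system states. Current approaches to safe learning in RL trade off safe exploration against task fulfillment simultaneously. In this paper, we introduce a new generation of RL solvers that learn to minimize safety violations while maximizing the task reward to the extent that the safe policy can tolerate. Our approach introduces a novel two-player framework for safe RL called the Distributive Exploration Safety Training Algorithm (DESTA). The core of DESTA is a game between two adaptive agents: a Safety Agent, whose task is to minimize safety violations, and a Task Agent, whose goal is to maximize the environment reward. Specifically, the Safety Agent can selectively take control of the system at any given point to prevent safety violations, while the Task Agent is free to execute its policy in all other states. This framework enables the Safety Agent to learn to take actions in certain states that minimize future safety violations, both during training and at test time, while the Task Agent performs actions that maximize task performance everywhere else. Theoretically, we prove that DESTA converges to stable points, enabling the safety violations of pretrained policies to be minimized. Empirically, we show DESTA's ability, first, to improve the safety of existing policies and, second, to construct safe RL policies when the Task Agent and the Safety Agent are trained concurrently. We demonstrate DESTA's superior performance against leading RL methods in Lunar Lander and Frozen Lake from OpenAI Gym.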